
    Generalized empirical Bayesian methods for discovery of differential data in high-throughput biology

    Motivation: High-throughput data are now commonplace in biological research. Rapidly changing technologies and applications mean that novel methods for detecting differential behaviour that account for a ‘large P, small n’ setting are required at an increasing rate. Such methods are generally developed on an ad hoc basis, requiring repeated development cycles and resulting in a lack of standardization between analyses. Results: We present a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. The approach builds on our baySeq algorithm, which identifies differential expression in RNA-seq data using a negative binomial distribution, and in paired data using a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs. Availability and implementation: The methods are implemented in the R baySeq (v2) package, available on Bioconductor: http://www.bioconductor.org/packages/release/bioc/html/baySeq.html. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online. This work was supported by European Research Council Advanced Investigator Grant ERC-2013-AdG 340642 (TRIBE). This is the author accepted manuscript. The final version is available from Oxford University Press via http://dx.doi.org/10.1093/bioinformatics/btv56
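As a loose illustration of the empirical Bayesian idea, the sketch below compares a "no differential expression" model against a "differential expression" model for one gene under a negative binomial likelihood. It is a hypothetical Python analogue with invented parameter values, not the package's interface (baySeq itself is an R package, and it estimates priors empirically from the whole data set rather than plugging in per-gene sample means):

```python
# Toy empirical-Bayes model comparison for one gene's counts.
# All parameter values and helper names are illustrative assumptions.
import numpy as np
from scipy.stats import nbinom

def nb_loglik(counts, mean, dispersion):
    # scipy parameterizes the negative binomial by (n, p); convert from
    # (mean, dispersion), where variance = mean + dispersion * mean**2.
    n = 1.0 / dispersion
    p = n / (n + mean)
    return nbinom.logpmf(counts, n, p).sum()

def posterior_de(group_a, group_b, prior_de=0.1, dispersion=0.1):
    """Posterior probability that one gene is differentially expressed.
    Null model: both groups share a single NB mean; alternative model:
    each group has its own mean. Plug-in sample means stand in for the
    empirical priors that baySeq estimates from the whole data set."""
    pooled = np.concatenate([group_a, group_b])
    ll_null = nb_loglik(pooled, pooled.mean(), dispersion)
    ll_alt = (nb_loglik(group_a, group_a.mean(), dispersion)
              + nb_loglik(group_b, group_b.mean(), dispersion))
    # Weigh each model by its prior; work in log-space for stability.
    log_alt = np.log(prior_de) + ll_alt
    log_null = np.log(1.0 - prior_de) + ll_null
    m = max(log_alt, log_null)
    return float(np.exp(log_alt - m)
                 / (np.exp(log_alt - m) + np.exp(log_null - m)))

p_diff = posterior_de(np.array([5, 7, 6]), np.array([40, 55, 48]))
p_same = posterior_de(np.array([5, 7, 6]), np.array([6, 5, 7]))
```

With a small prior weight on differential expression (0.1 here), clearly separated groups still yield a posterior near 1, while near-identical groups fall back to roughly the prior.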

    Improving Energy-Efficiency of Multicores using First-Order Modeling

    In recent decades, power consumption has become one of the most critical resources in a computer system. In the form of electricity bills in data centers, battery life in mobile devices, or thermal constraints in desktops and laptops, power consumption imposes several limitations on today's processors, and improving power and energy efficiency is one of the most urgent research topics in computer architecture. Dynamic Voltage and Frequency Scaling (DVFS) and Cache Resizing are among the most popular energy saving techniques. Previous work, however, has focused on developing heuristics and trial-and-error methods that yield acceptable savings but fail to provide insight into how these techniques affect the power and performance of a computer system. In contrast, this thesis proposes the use of first-order modeling to improve the energy efficiency of computer systems. A first-order model needs to be (i) accurate enough to efficiently drive DVFS and Cache Resizing decisions, and (ii) simple enough to eliminate the overhead of collecting the required inputs to the model. We show that such models can be constructed and successfully applied in modern systems. For DVFS, we propose to scale frequency down to exploit applications' memory slack, i.e., periods that the processor spends waiting for data to be fetched from main memory. In such cases, the processor frequency can be scaled down to save energy without inordinate performance penalty. Our DVFS models can detect slack and predict the impact of DVFS on both power and performance with great accuracy. Cache Resizing, on the other hand, relies on the fact that many applications do not benefit from the vast amount of cache that modern processors are equipped with. In such cases, the cache can be resized to save static energy consumption at limited performance cost.
    Since both techniques are related to the memory behavior of applications, we propose a unified model to manage the two techniques in tandem and maximize energy efficiency through synergistic DVFS and Cache Resizing. Finally, our experience with DVFS on real systems motivated us to contribute to the integration of DVFS into the gem5 simulator. Unlike other simulators that ignore the role of the OS in DVFS, we extend gem5 by developing the hardware and software components that allow the existing Linux DVFS infrastructure to be seamlessly integrated into the simulator. (UPMAR)
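The first-order DVFS idea above, that core-bound time scales with frequency while memory slack does not, can be sketched as follows. All constants (the V-f curve, capacitance, static power, and frequency set) are invented for illustration and are not values from the thesis:

```python
# Toy first-order DVFS model: execution time splits into core-bound cycles
# that scale with frequency and memory slack that does not. All constants
# here are illustrative assumptions.

FREQS = [1.0, 1.5, 2.0, 2.5, 3.0]   # available frequencies (GHz)

def exec_time(f, compute_cycles, mem_time):
    # compute_cycles in Gcycles, so compute_cycles / f is seconds;
    # mem_time is the frequency-independent memory slack (seconds).
    return compute_cycles / f + mem_time

def energy(f, compute_cycles, mem_time, c=1.0, p_static=0.3):
    v = 0.6 + 0.1 * f               # simple linear V-f assumption
    t = exec_time(f, compute_cycles, mem_time)
    # Dynamic power ~ C*V^2*f, plus static power; simplified by charging
    # dynamic power over the whole runtime.
    return (c * v * v * f) * t + p_static * t

def best_frequency(compute_cycles, mem_time, max_slowdown=1.10):
    """Lowest-energy frequency whose slowdown vs. f_max stays in bound."""
    t_base = exec_time(max(FREQS), compute_cycles, mem_time)
    feasible = [f for f in FREQS
                if exec_time(f, compute_cycles, mem_time)
                   <= max_slowdown * t_base]
    return min(feasible, key=lambda f: energy(f, compute_cycles, mem_time))

# Memory-bound phase: lots of slack -> can run slower. Compute-bound: not.
f_mem = best_frequency(compute_cycles=1.0, mem_time=2.0)
f_cpu = best_frequency(compute_cycles=3.0, mem_time=0.1)
```

Under these invented constants, the memory-bound phase settles at a lower frequency than the compute-bound one, which is exactly the slack-exploiting behavior described above.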

    Automatic alignment of ontologies for the same thematic domain, with emphasis on their matching and merging

    In recent years the phenomenon of information overload has arisen: information is continuously growing, is provided in various forms, and is stored in decentralized/distributed systems that range from inter- and intra-organization systems to those operating over the World Wide Web. Ontologies are a key technology for information engineers to shape information by formalizing agreed conceptualizations in specific domains. The aim is to enhance the proper manipulation of existing information, in an attempt to deal with the information overload phenomenon. Still, although ontologies provide formal and unambiguous representations of domain conceptualizations, it would be a surprise if two independent parties had constructed the same ontology, even for the same domain. Such a situation is very common in an open and anonymous environment such as the World Wide Web. Interoperability can be achieved by reaching an agreement and producing a single, well-agreed ontology, or by aligning ontologies. This thesis focuses on the ontology alignment area of research. Specifically, it makes the following three contributions: 1. Current state-of-the-art methods exploit “surface features” of the ontologies, such as labels of elements, instances of concepts, and the structure of the ontologies, for the computation of equivalence mapping relations. This is achieved through the various techniques these methods employ; for example, a method may map concepts with similar labels, or with similarly defined properties. In this thesis we take a step further and generate “latent features” (based on the features of the source and target ontologies) which are not directly present in the ontologies but can be utilized for a more precise representation of the ontological elements. Towards this end, the method utilizes Probabilistic Topic Models for the generation of representative features (i.e. “latent features”) for the representation of ontology elements.
    2. We propose the CSR method for locating subsumption mapping relations between elements of different ontologies by utilizing supervised machine learning techniques. Currently, although the usefulness of subsumption mappings is known to the ontology alignment community, the vast majority of state-of-the-art methods focus on the computation of equivalence mappings, and only a few of them target ordered mappings such as subsumption mappings. 3. It is common practice for the vast majority of mapping systems to be composed of numerous individual mapping methods. Each method locates its own mapping pairs by utilizing its own logic, and the final result is produced by synthesizing the results of all individual methods. The problem of synthesizing different ontology alignment methods is considered an open and vital issue in the ontology alignment community. Towards this end, we propose a “model-based synthesis” method, which addresses this problem by maximizing the social welfare within a group of interacting agents (i.e. maximizing the sum of the utilities of the individual agents). Each agent is responsible for making a mapping decision concerning a specific ontology element using a specific mapping method. Agents need to reach an agreement on the mapping of the ontology elements, consistently with the semantics of the specifications and according to their preferences, which are provided by the mapping methods.
    Ontologies have emerged as an important technology for information management. Although ontologies offer a formal way of representing knowledge, different ontologies originating from independent groups are often observed to describe essentially the same information within a common thematic domain. To achieve interoperability, it is necessary either to produce a commonly accepted ontology or to define correspondences between the elements of the groups' source ontologies. Ontology alignment is of particular importance in this direction. More specifically, the present thesis focuses on the following three points: 1. In contrast to the majority of existing ontology alignment methods, which exploit “surface features” of the ontologies to locate equivalence correspondences, a method is proposed that generates and exploits “latent features” for a more precise description of ontology concepts; this is achieved using Probabilistic Topic Models. 2. In contrast to the majority of existing methods, which focus on locating equivalence correspondences, a method is proposed for locating subsumption correspondences using supervised machine learning techniques. 3. It is common practice for ontology alignment systems to use a multitude of different methods, with the final result produced by synthesizing their results. This thesis proposes a method that treats the synthesis problem as one of maximizing the total welfare within a set of interacting agents.
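The third contribution, synthesis as social-welfare maximization, can be sketched with a brute-force toy in Python. The matcher names, candidate mappings, and confidence scores below are invented; a real system would also encode consistency constraints that couple the agents' decisions:

```python
# Toy "model-based synthesis": pick the joint accept/reject decision over
# candidate mappings that maximizes the sum of agent utilities (social
# welfare). All names and scores below are invented for illustration.
from itertools import product

candidates = ["Person=Human", "Author<Person", "Paper=Article"]

# Each agent pairs one mapping method with its confidence per candidate.
scores = {
    "label_matcher":     {"Person=Human": 0.9, "Author<Person": 0.2,
                          "Paper=Article": 0.8},
    "structure_matcher": {"Person=Human": 0.6, "Author<Person": 0.7,
                          "Paper=Article": 0.4},
}

def welfare(decision):
    # decision: tuple of booleans (accept/reject), one per candidate.
    # An agent's utility is its confidence when a mapping is accepted
    # and (1 - confidence) when it is rejected.
    total = 0.0
    for agent in scores.values():
        for accept, mapping in zip(decision, candidates):
            total += agent[mapping] if accept else 1.0 - agent[mapping]
    return total

best = max(product([False, True], repeat=len(candidates)), key=welfare)
selected = [m for accept, m in zip(best, candidates) if accept]
```

Because this toy welfare is separable per mapping, the optimum simply accepts every mapping whose average confidence exceeds 0.5; interaction constraints (e.g. one-to-one mappings or subsumption consistency) are what make the agents' agreement non-trivial in practice.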

    Towards Power Efficiency on Task-Based, Decoupled Access-Execute Models

    This work demonstrates the potential of hardware and software optimization to improve the effectiveness of dynamic voltage and frequency scaling (DVFS). For software, we decouple data prefetch (access) and computation (execute) to enable optimal DVFS selection for each phase. For hardware, we use measurements from state-of-the-art multicore processors to accurately model the potential of per-core, zero-latency DVFS. We demonstrate that the combination of decoupled access-execute and precise DVFS has the potential to decrease EDP by 25-30% without reducing performance. The underlying insight in this work is that by decoupling access and execute we can take advantage of the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency. For the memory-bound access phase, where we prefetch data into the cache from main memory, we can run at a reduced frequency and voltage without hurting performance. Thereafter, the execute phase can run much faster, thanks to the prefetching of the access phase, and achieve higher performance. This decoupled program behavior allows us to achieve more effective use of DVFS than standard coupled executions, which mix data access and compute. To understand the potential of this approach, we measure application performance and power consumption on a modern multicore system across a range of frequencies and voltages. From this data we build a model that allows us to analyze the effects of per-core, zero-latency DVFS. The results of this work demonstrate the significant potential for finer-grain DVFS in combination with DVFS-optimized software.
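The intuition that per-phase frequency selection beats a single coupled setting can be conveyed with a toy EDP (energy-delay product) model. The frequency set, V-f curve, and phase parameters below are invented and are unrelated to the paper's measured 25-30% figure:

```python
# Toy model of coupled vs. decoupled access-execute under DVFS.
# All constants are illustrative assumptions.

FREQS = [1.0, 2.0, 3.0]             # GHz

def phase_time(f, cycles, stall):
    return cycles / f + stall       # stall time does not scale with f

def power(f):
    v = 0.6 + 0.1 * f               # simple linear V-f assumption
    return v * v * f + 0.3          # dynamic (C*V^2*f, C=1) + static

# Access phase: mostly memory stalls; execute phase: mostly core cycles.
ACCESS = dict(cycles=0.2, stall=1.0)
EXECUTE = dict(cycles=2.0, stall=0.1)

def edp(f_access, f_execute):
    t = phase_time(f_access, **ACCESS) + phase_time(f_execute, **EXECUTE)
    e = (power(f_access) * phase_time(f_access, **ACCESS)
         + power(f_execute) * phase_time(f_execute, **EXECUTE))
    return e * t

# Coupled: one frequency for both phases; decoupled: per-phase choice.
coupled = min(edp(f, f) for f in FREQS)
decoupled = min(edp(fa, fe) for fa in FREQS for fe in FREQS)
```

Here the best decoupled schedule runs the access phase at the lowest frequency and the execute phase at the highest, beating the best single coupled frequency on EDP.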

    Towards more efficient execution: a decoupled access-execute approach

    The end of Dennard scaling is expected to shrink the range of DVFS in future nodes, limiting the energy savings of this technique. This paper evaluates how much we can increase the effectiveness of DVFS by using a software decoupled access-execute approach. Decoupling the data access from execution allows us to apply optimal voltage-frequency selection for each phase and therefore improve energy efficiency over standard coupled execution. The underlying insight of our work is that by decoupling access and execute we can take advantage of the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency while maintaining good performance. To demonstrate this, we built a task-based parallel execution infrastructure consisting of: (1) a runtime system to orchestrate the execution, (2) power models to predict optimal voltage-frequency selection at runtime, (3) a modeling infrastructure based on hardware measurements to simulate zero-latency, per-core DVFS, and (4) a hardware measurement infrastructure to verify our model's accuracy. Based on real hardware measurements, we project that the combination of decoupled access-execute and DVFS has the potential to improve EDP by 25% without hurting performance. On memory-bound applications we significantly improve performance due to increased MLP in the access phase and ILP in the execute phase. Furthermore, we demonstrate that our method can achieve high performance both in the presence and in the absence of a hardware prefetcher. (LPGPU FP7-ICT-288653, UPMAR)
